Abstract:

Smaller feature sizes, reduced voltage levels, higher transistor counts, and reduced noise
margins make future generations of microprocessors increasingly prone to transient
hardware faults. Most commercial fault-tolerant computers use fully replicated
components to detect microprocessor faults. The components are lockstepped (cycle-by-
cycle synchronized) to ensure that, in each cycle, they perform the same operation on
the same inputs, producing the same outputs in the absence of faults. Unfortunately, for
a given hardware budget, full replication reduces performance by statically partitioning
resources among redundant operations. We demonstrate that a Simultaneous and
Redundantly Threaded (SRT) processor, derived from a Simultaneous Multithreaded
(SMT) processor, provides transient fault coverage with significantly higher performance.
An SRT processor provides transient fault coverage by running identical copies of the
same program simultaneously as independent threads. An SRT processor provides higher
performance because it dynamically schedules its hardware resources among the
redundant copies. However, dynamic scheduling makes it difficult to implement
lockstepping, because corresponding instructions from redundant threads may not
execute in the same cycle or in the same order. This paper makes four contributions to
the design of SRT processors. First, we introduce the concept of the sphere of replication,
which abstracts both the physical redundancy of a lockstepped system and the logical
redundancy of an SRT processor. This framework aids in identifying the scope of fault
coverage and the input and output values requiring special handling. Second, we identify
two viable spheres of replication in an SRT processor, and show that one of them
provides fault detection while checking only committed stores and uncached loads.
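
The output-comparison idea in this abstract (run two redundant copies and check only the values that leave the sphere of replication, such as committed stores) can be pictured with a small software model. The Python sketch below is purely illustrative and is not the hardware mechanism from the paper: the toy program, the function names `running_sum_program` and `first_store_mismatch`, and the single-bit fault model are all assumptions made for this example.

```python
from typing import List, Optional, Tuple

Store = Tuple[int, int]  # (address, value) of one committed store


def running_sum_program(inputs: List[int],
                        flip_bit_at: Optional[int] = None) -> List[Store]:
    """Toy 'program': store the running sum of the inputs at addresses 0, 1, ...

    flip_bit_at is a hypothetical fault-injection hook: at that step one bit
    of the accumulator is flipped to mimic a transient fault.
    """
    stores: List[Store] = []
    acc = 0
    for addr, x in enumerate(inputs):
        acc += x
        if addr == flip_bit_at:
            acc ^= 1 << 3  # assumed fault model: a single flipped bit
        stores.append((addr, acc))
    return stores


def first_store_mismatch(leading: List[Store],
                         trailing: List[Store]) -> Optional[int]:
    """Compare the committed stores of two redundant copies.

    Returns the index of the first store that differs, or None if the two
    store streams are identical (no fault detected).
    """
    for i, (a, b) in enumerate(zip(leading, trailing)):
        if a != b:
            return i
    if len(leading) != len(trailing):
        return min(len(leading), len(trailing))
    return None


inputs = [1, 2, 3, 4]
clean = running_sum_program(inputs)
faulty = running_sum_program(inputs, flip_bit_at=2)

print(first_store_mismatch(clean, clean))   # fault-free copies agree
print(first_store_mismatch(clean, faulty))  # corrupted copy is caught at a store
```

In this model the internal state of each copy is never compared; a fault is caught only when it changes a store, which mirrors checking at the boundary of the sphere of replication rather than at every cycle.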
a microarchitectural approach to fault tolerance
in microprocessors
Abstract:

This paper speculates that technology trends pose new challenges for fault tolerance in
microprocessors. Specifically, severely reduced design tolerances implied by gigahertz
clock rates may result in frequent and arbitrary transient faults. We suggest that existing
fault-tolerant techniques (system-level, gate-level, or component-specific approaches) are
either too costly for general purpose computing, overly intrusive to the design, or
insufficient for covering arbitrary logic faults. An approach in which the microarchitecture
itself provides fault tolerance is required. We propose a new time redundancy
fault-tolerant approach in which a program is duplicated and the two redundant programs
simultaneously run on the processor. The technique exploits several significant
microarchitectural trends to provide broad coverage of transient faults and restricted
coverage of permanent faults. These trends are simultaneous multithreading, control flow
and data flow prediction, and hierarchical processors, all of which are intended for higher
performance, but which can be easily leveraged for the specified fault tolerance goals.
The overhead for achieving fault tolerance is low, both in terms of performance and
changes to the existing microarchitecture. Detailed simulations of five of the SPEC95
benchmarks show that executing two redundant programs on the fault-tolerant
microarchitecture takes only 10% to 30% longer than running a single version of the
program.

Index Terms: 

computer architecture, fault tolerant computing, redundancy, AR-SMT, fault tolerance
Abstract:

This paper examines simultaneous multithreading, a technique permitting several
independent threads to issue instructions to a superscalar's multiple functional units in a
single cycle. We present several models of simultaneous multithreading and compare
them with alternative organizations: a wide superscalar, a fine-grain multithreaded
processor, and single-chip, multiple-issue multiprocessing architectures. Our results show
that both (single-threaded) superscalar and fine-grain multithreaded architectures are
limited in their ability to utilize the resources of a wide-issue processor. Simultaneous
multithreading has the potential to achieve 4 times the throughput of a superscalar,
and double that of fine-grain multithreading. We evaluate several cache configurations
made possible by this type of organization and evaluate tradeoffs between them. We also
show that simultaneous multithreading is an attractive alternative to single-chip
multiprocessors; simultaneous multithreaded processors with a variety of
organizations outperform corresponding conventional multiprocessors with similar
execution resources. While simultaneous multithreading has excellent potential to
increase processor utilization, it can add substantial complexity to the design. We
examine many of these complexities and evaluate alternative organizations in the design
space.
cache configurations, computational complexity, computer architecture, fine-grain multithreaded
processor, multiprocessing architectures, multiprocessing systems, on-chip parallelism